COMPUTABILITY OF THE ENTROPY OF ONE-TAPE TURING MACHINES



EMMANUEL JEANDEL 



Abstract. We prove that the maximum speed and the entropy of a one-tape
Turing machine are computable, in the sense that we can approximate them
to any given precision $\epsilon$. This is contrary to popular belief, as all dynamical
properties are usually undecidable for Turing machines. The result is quite
specific to one-tape Turing machines, as it is not true anymore for two-tape
Turing machines by the results of Blondel et al., and uses the approach of
crossing sequences introduced by Hennie.

Introduction

The Turing machine is probably the most well known of all models of computation.
This particular model has many variations, which all lead to the same notion
of computability. The easiest model is the Turing machine with just one tape and
one head, which we will consider in this paper.

From the point of view of computability, this model is equivalent to all others.
From the point of view of complexity, however, the situation is very different.
Indeed, it is well known [Hen65, Har68, Tra64] that a language (of finite words)
accepted by such a Turing machine in linear time is always regular. More precisely,
it can be proven that if such a Turing machine runs in time $O(n)$ on all inputs, then
there is a constant k so that, on any input, the machine passes at most k times in
any given position.

We will consider in this paper the Turing machine as a dynamical system: the
execution starts from any given configuration c, i.e. any initial state and any
initial tape, and we observe the evolution. While the Turing machine is a model
of computation, it is also quite important in the study of dynamical systems.
It was intensively studied by Kurka [Kur97], and Moore [Moo91a, Moo91b] proved
that Turing machines can be embedded in various "real life" dynamical systems.
As an example, the uncomputability of the entropy of a Turing machine, by
Blondel et al. [DB04], can be used to deduce the uncomputability of the entropy
of piecewise-affine maps, proven by Koiran [Koi01] in a different way.

However, these undecidability results are usually obtained for Turing machines 
with two tapes. The basic idea is to use one tape to simulate a given Turing machine
M, and to control the other tape, which will mostly do nothing computationally
interesting. The computational complexity of the new Turing machine will come 
from the first tape, but the dynamical complexity will come from the second tape. 

There is a reason why these results use Turing machines with two tapes. We
will prove indeed that some dynamical quantities for one-tape Turing machines are
actually computable, in the sense that there is an algorithm that, given any
$\epsilon$, gives an approximation up to $\epsilon$. The two quantities we consider are the
speed and the entropy of a Turing machine. While the most theoretically important
quantity is the entropy, we will concentrate our discussion in the introduction on
the speed, which is easier to picture.

The speed of a Turing machine measures how fast the head goes to infinity.
Informally, the speed is greater than $\alpha$ if we can find a configuration c for which
the Turing machine is roughly in position $\alpha n$ after n units of time. Note that if
$\alpha$ is nonzero, this means that it takes a time $n/\alpha = O(n)$ to be in position n.
Now, if we recall a previous result, a Turing machine with one tape with running
time $O(n)$ on all inputs does nothing interesting. We will prove, using the same
techniques, that this also applies to the maximum speed: if the maximum speed is
nonzero, hence the running time on some infinite configuration is (asymptotically)
linear, then there is a regular (ultimately periodic) configuration that achieves this
maximum speed.

This paper is organized as follows. In the first section, we introduce the formal 
definitions of the speed and entropy of a Turing machine. In the next section, 
we proceed to prove the three main theorems: the speed and the entropy are
computable, and the speed is actually a rational number, achieved by an ultimately
periodic configuration.

1. Definitions 

We assume the reader is familiar with Turing machines. A (one-tape) Turing
machine M is a (total) map $\delta_M : Q \times \Sigma \to Q \times \Sigma \times \{-1, 0, 1\}$ where Q is a finite
set called the set of states, and $\Sigma$ a finite alphabet.

Now, the best way to see it as a dynamical system might be unorthodox at first.
The idea is to consider the Turing machine as having a moving tape rather than a
moving head: a configuration is then an element of $C = Q \times \Sigma^{\mathbb{Z}}$, and the map M on
C is defined as follows: $M((q, c)) = (q', c')$ where $\delta_M(q, c(0)) = (q', a, v)$, $c'(-v) = a$
and $c'(i) = c(i + v)$ for all $i \neq -v$. This distinction is particularly important for the
definition of the entropy to be technically correct. However it is best to consider
the Turing machines as we are used to, and we will say "the Turing machine is in
position i" rather than "the tape has moved i positions to the right".
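The moving-tape convention can be made concrete with a minimal Python sketch. It assumes a tape stored as a dictionary from integer positions to symbols, with a default symbol for unlisted cells (so only finitely supported tapes are represented). The names `step`, `delta` and `blank` are illustrative and do not come from the paper.

```python
def step(delta, config, blank="a"):
    """One application of M in the moving-tape view.

    delta maps (state, symbol) to (state, symbol, move), move in {-1, 0, 1}.
    config is (state, tape); tape is a dict position -> symbol and the head
    always reads position 0. Cells missing from the dict hold `blank`.
    """
    state, tape = config
    new_state, written, move = delta[(state, tape.get(0, blank))]
    # c'(i) = c(i + v): every stored cell slides by -v under the fixed head.
    new_tape = {pos - move: sym for pos, sym in tape.items()}
    # c'(-v) = a: the cell that was under the head now holds the written symbol.
    new_tape[-move] = written
    return new_state, new_tape
```

With the single rule `delta = {("q", "x"): ("q", "y", 1)}`, applying `step` to `("q", {0: "x", 1: "z"})` gives `("q", {-1: "y", 0: "z"})`: the head has not moved, but the tape has shifted one cell to the left.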

The speed. Given a configuration $c \in C$, the speed of M on c is the average
number of cells that are read per unit of time. Formally, let $s_n(c)$ be the number of
different cells read during the first n steps of the evolution of the Turing machine
M on input c. Note that $s_n$ is subadditive: $s_{n+m}(c) \le s_n(c) + s_m(M^n(c))$.

Definition 1.

$$\overline{s}(c) = \limsup_n \frac{s_n(c)}{n} \qquad \underline{s}(c) = \liminf_n \frac{s_n(c)}{n}$$

We give two examples.

• Consider a Turing machine with two states $q_1, q_2$. On $q_1$, the Turing machine
goes to $q_2$ without changing the position of the head. On $q_2$ the
Turing machine goes right and changes back to $q_1$. Then $\overline{s}(c) = \underline{s}(c) = 1/2$
for all c.

• Consider a Turing machine with two states {L, R} (for Left and Right)
and two symbols {a, b}. In state q, when the machine reads a symbol a,
it goes in the direction q. When the machine reads a symbol b, it writes
a symbol a instead and changes direction. On input c = (R, w) where w
contains only the symbol a, the Turing machine will only go to the right,
and $\overline{s}(c) = \underline{s}(c) = 1$. On input c = (R, w) where w contains only the
symbol b, the Turing machine will zigzag, and will reach the n-th symbol
to the right in time $O(n^2)$, hence will see only $O(\sqrt{n})$ symbols in time n,
hence $\overline{s}(c) = \underline{s}(c) = 0$. On input c = (R, w) where w contains b only at all
positions $(-2)^i$, the Turing machine will have read (for n even) $2^n + 2^{n-1}$
symbols at time $2^{n+1} + 2^n - 2$ and $\overline{s}(c) = 1/2$, but only $2^{n-1} + 2^{n-2}$ at
time $2^{n+1} + 2^{n-2} - 2$, and $\underline{s}(c) = 1/3$. This is illustrated on Figure 1.

Figure 1. Three different behaviours of the same Turing machine
on three different inputs. In the first one, the speed is 1. In the
second one the speed is 0. In the third one, the speed is between
1/3 and 1/2. Time goes bottom-up.
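The three behaviours of the second example can be checked numerically with a short simulation. This is only a sketch: it uses the usual moving-head view, and the transition table `ZIGZAG` is one assumed reading of the informal description (on b, write a, flip the state and move in the new direction). The helper name `cells_read` is illustrative.

```python
def cells_read(delta, state, tape, steps, default="a"):
    """Return s_n(c): the number of distinct cells read in the first `steps` steps."""
    tape = dict(tape)          # work on a copy of the initial tape
    pos, seen = 0, set()
    for _ in range(steps):
        seen.add(pos)
        state, written, move = delta[(state, tape.get(pos, default))]
        tape[pos] = written
        pos += move
    return len(seen)

# Assumed transition table for the second example.
ZIGZAG = {
    ("R", "a"): ("R", "a", +1),
    ("L", "a"): ("L", "a", -1),
    ("R", "b"): ("L", "a", -1),
    ("L", "b"): ("R", "a", +1),
}

# b exactly at positions (-2)^i, truncated to finitely many i for the simulation.
SPARSE_B = {(-2) ** i: "b" for i in range(14)}

if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        t = 2 ** (n + 1) + 2 ** n - 2
        print(n, t, cells_read(ZIGZAG, "R", SPARSE_B, t) / t)
```

On the all-a tape the ratio `cells_read(...) / steps` is 1, on the all-b tape (`default="b"`) it tends to 0, and on `SPARSE_B` the ratio at the times $2^{n+1} + 2^n - 2$ approaches 1/2, in line with the discussion above.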

Now we define the speed of a Turing machine as the maximum of its average speed
on all configurations:

Definition 2.

$$S(M) = \max_{c \in C} \overline{s}(c) = \max_{c \in C} \underline{s}(c) = \lim_n \sup_c \frac{s_n(c)}{n} = \inf_n \sup_c \frac{s_n(c)}{n}$$

The fact that all these definitions are equivalent, and that the maximum speed
is indeed a maximum (it is reached by some configuration), is a consequence of the
subadditivity of $(s_n)_{n \in \mathbb{N}}$, see [FH10, Theorem 1.1] or [M99] for a more combinatorial
proof. We give here a direct proof, following [M99] (the main difference is that a
continuity argument is replaced by a finiteness argument).

Theorem 1.

$$S(M) = \max_c \overline{s}(c) = \max_c \underline{s}(c) = \lim_n \sup_c \frac{s_n(c)}{n} = \inf_n \sup_c \frac{s_n(c)}{n}$$

Proof. It is clear that the last two quantities are equal ($g_n = \sup_c s_n(c)$ satisfies
$g_{n+m} \le g_n + g_m$), and that they are larger than the others. Let $\alpha = \inf_n \sup_c \frac{s_n(c)}{n}$.
We will prove that there exists c so that $s_n(c) \ge \alpha n$ for all n, which will prove the
theorem. By a compactness argument, it is sufficient to prove that for all n, there
exists c so that $s_k(c) \ge \alpha k$ for all $k \le n$ (let $c_n$ be such a configuration, and take
c to be any limit point of $(c_n)_{n > 0}$).

So suppose it is false for some n. That is, for all configurations c, there exists
$k \le n$ so that $s_k(c)/k < \alpha$. Recall that $s_k$ is an integer between 0 and k, so that the
left side of this inequality, when k and c vary, can take only finitely many values.
This means there exists $\beta < \alpha$ so that for all configurations c, there exists $k \le n$
so that $s_k(c)/k \le \beta$.

Now let c be any configuration. There exists $0 < k_0 \le n$ so that $s_{k_0}(c) \le \beta k_0$.
Applying the same reasoning to $M^{k_0}(c)$, we know there exists $0 < k_1 \le n$ so that
$s_{k_1}(M^{k_0}(c)) \le \beta k_1$, etc.

By subadditivity of s, this implies that for every m, we have
$s_m(c) \le \beta k_0 + \beta k_1 + \ldots + \beta k_q + s_p(M^{k_0 + k_1 + \ldots + k_q}(c))$
for some $p < n$ and some q so that $k_0 + \ldots + k_q + p = m$.

Hence $s_m(c) \le \beta m + n$. For $m = \lceil 2n/(\alpha - \beta) \rceil$, this implies
$\frac{s_m(c)}{m} \le \frac{\alpha + \beta}{2}$. This is true of all c, so
$\sup_c \frac{s_m(c)}{m} \le \frac{\alpha + \beta}{2} < \alpha$, a contradiction with the definition of $\alpha$. □

The entropy. The (topological) entropy of a Turing machine is a quantity that 
measures the complexity of the trajectories. It represents roughly the average 
number of bits needed to represent the trajectories. 

For a configuration c, the trace of c is the word $u \in (\Sigma \times Q)^{\mathbb{N}}$ where $u_i$ contains
the letter in position 0 of the tape and the state at the i-th step during the execution
of M on input c. We denote by $T(c)$ the trace of c and by $T(c)|_n$ the first n letters of the
trace. Finally, we denote $T_n = \{T(c)|_n, c \in C\}$.

Then the entropy can be defined by

Definition 3.

$$H(M) = \lim_n \frac{1}{n} \log |T_n| = \inf_n \frac{1}{n} \log |T_n|$$

The limit indeed exists and is equal to the infimum as $(\log |T_n|)_{n \in \mathbb{N}}$ is subadditive.
This definition is a specialized version for (moving tape) Turing machines of the
general definition of entropy, and was proven equivalent in [Opr06].

Let's go back to the examples. In the first case, $T(c)|_n$ can take roughly $|\Sigma|^{n/2}$
different values, and $H(M) = \frac{1}{2} \log |\Sigma|$. In the second case, it can be proven
that the first n letters of $T(c)$ can contain at most $\sqrt{n}$ symbols (b, L) or (b, R) (the
maximum is obtained starting from a configuration with only b). As a consequence,
$T_n$ is of size at most $\sum_{i \le \sqrt{n}} \binom{n}{i} \simeq \binom{n}{\sqrt{n}}$ and $H(M) = 0$.
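For very small n, the sets $T_n$ can be enumerated by brute force, since the first n letters of a trace only depend on the initial state and on the cells at positions $-(n-1), \ldots, n-1$. The sketch below does this for the first example over a binary alphabet. The names `trace`, `count_traces` and `FIRST` are illustrative, and the choice that the machine rewrites the symbol it reads is an assumption (the text does not say what is written).

```python
from itertools import product

def trace(delta, state, tape, n):
    """First n letters (symbol under the head, state) of the trace."""
    tape, pos, letters = dict(tape), 0, []
    for _ in range(n):
        letters.append((tape[pos], state))
        state, written, move = delta[(state, tape[pos])]
        tape[pos] = written
        pos += move
    return tuple(letters)

def count_traces(delta, states, alphabet, n):
    """|T_n| by brute force over all windows of cells -(n-1)..(n-1)."""
    cells = range(-(n - 1), n)
    traces = set()
    for state in states:
        for word in product(alphabet, repeat=len(cells)):
            traces.add(trace(delta, state, dict(zip(cells, word)), n))
    return len(traces)

# First example: q1 switches to q2 without moving, q2 moves right and
# returns to q1. Leaving the symbol unchanged is an assumption.
ALPHABET = (0, 1)
FIRST = {}
for x in ALPHABET:
    FIRST[("q1", x)] = ("q2", x, 0)
    FIRST[("q2", x)] = ("q1", x, +1)
```

The enumeration is exponential in n, so it only serves as a sanity check. For this machine `count_traces(FIRST, ("q1", "q2"), ALPHABET, 4)` returns 12, and $\frac{1}{n} \log_2 |T_n|$ approaches $\frac{1}{2} \log_2 |\Sigma| = 1/2$ as n grows.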

It is possible to give a definition for the entropy that is very similar to the speed.
For this, we use Kolmogorov complexity. The (prefix-free) Kolmogorov complexity
$K(x)$ of a finite word x is, roughly speaking, the length of the shortest program that
outputs x.

We do not define precisely the Kolmogorov complexity, see e.g. [DH10], but will
use mainly the following easy results:

• For any alphabet $\Sigma$, there exist constants c and c' so that for all words
u over $\Sigma$, $K(u) \le |u| \log |\Sigma| + 2 \log |u| + c$ and for all words u, v,
$K(uv) \le K(u) + K(v) + c'$.

• For any computable function f, there exists a constant c so that
$K(f(w)) \le K(w) + c$ whenever $f(w)$ is defined.

For a trace t, define the lower and upper complexity of t by
$\underline{K}(t) = \liminf_n \frac{K(t|_n)}{n}$ and $\overline{K}(t) = \limsup_n \frac{K(t|_n)}{n}$.

Theorem 2. [Bru83, Sim]

$$H(M) = \max_c \overline{K}(T(c)) = \max_c \underline{K}(T(c))$$

From this definition, it will not be surprising that we can obtain results on both
speed and entropy using the same arguments.

2. A RELATION BETWEEN THE SPEED AND THE ENTROPY

While this section is not strictly necessary for our result, it gives some intuition 
about what is happening, and an interesting relation between the two quantities. 

Recall the definitions of the speed and the entropy, in terms of Kolmogorov
complexity. It is clear that $T(c)|_n$ depends only on the content of the cells visited
during the n first steps of the Turing machine, and on the initial state. In particular
$K(T(c)|_n) \le s_n(c) \log |\Sigma| + O(\log s_n(c))$.

As a consequence, we obtain



Proposition 3. For all configurations c,

$$\overline{s}(c) \ge \limsup_n \frac{K(T(c)|_n)}{n \log |\Sigma|} \qquad \underline{s}(c) \ge \liminf_n \frac{K(T(c)|_n)}{n \log |\Sigma|}$$

In particular

$$S(M) \ge \frac{H(M)}{\log |\Sigma|}$$

For some Turing machines (as one of the examples), the inequality may be strict:
this happens when there are some configurations of great speed, but not many of
them.

We can overcome this problem as follows. If M is a Turing machine over the alphabet
$\Sigma$ and A another alphabet, we denote by $M^A$ the Turing machine over
the alphabet $\Sigma \times A$ that works as M works, without changing the A component
(formally, the transition function satisfies $\delta_{M^A}(q, (x, a)) = (q', (x', a), v)$ if
$\delta_M(q, x) = (q', x', v)$).



Proposition 4.

$$\frac{H(M^A)}{\log |A|} \ge S(M)$$

Proof. First a few notations. If c is a configuration (a pair of a state and a word
over $\Sigma$) for the Turing machine M and w is a word over the alphabet A, we denote
by $c \otimes w$ the configuration for the Turing machine $M^A$ whose state is the state of c,
and whose letter in position i is the pair $(c_i, w_i)$. Denote by $C^A$ the set of configurations
of $M^A$ and by $T^A$ the trace of the machine $M^A$.

Let n be an integer and c a configuration of maximal speed. Now $T^A_n$ is of size
at least $|A|^{s_n(c)}$, as is witnessed by all configurations $c \otimes w$. Hence

$$\frac{1}{n} \log |T^A_n| \ge \frac{s_n(c)}{n} \log |A|$$

By taking a limit on both sides, we get the result. □

We obtain the main result of this section:

Corollary 5.

$$S(M) = \lim_{|A| \to \infty} \frac{H(M^A)}{\log |\Sigma \times A|}$$

Proof. Remark that $S(M) = S(M^A)$ and that $\log |A| / \log |A \times \Sigma| \to 1$, and combine
the last two propositions. □

To finish, we evaluate more precisely $H(M^A)$. For this, we look at $T^A_n$, the
first n letters of the traces of $M^A$. $T_n$ is the set of the first n letters of the traces of M, and is
obtained from different initial configurations $c_1 \ldots c_k$. Let $E_n = \{c_1 \ldots c_k\}$ so that
$T_n = \{T(c)|_n, c \in E_n\}$, then it is clear that $|T^A_n| = \sum_{c \in E_n} |A|^{s_n(c)}$, and

$$H(M^A) = \lim_n \frac{1}{n} \log \sum_{c \in E_n} |A|^{s_n(c)} = \lim_n \frac{1}{n} \log \sum_{c \in E_n} 2^{s_n(c) \log |A|}$$

Now this last definition makes sense also when $|A|$ is not an integer. If we
replace $\log |A|$ by x in the last definition, we obtain what is called the (topological)
pressure [FH10] of $(s_n)_{n \in \mathbb{N}}$, denoted $P_s(x, M)$. In the context of Turing machines,
the pressure has therefore a nice interpretation, at least in the case where x is the
logarithm of an integer. In particular we recover $P_s(0, M) = H(M)$ and we obtain
$\lim_{x \to +\infty} P_s(x, M)/x = S(M)$, which is also a consequence of a more general result,
see [FH10, Theorem 1.2].

3. COMPUTABILITY OF THE SPEED AND THE ENTROPY 

We will prove in this section that the speed and the entropy of a Turing machine 
are computable. 

The proof goes as follows. First, by the definition of the speed as an infimum,
we can compute a sequence $s_n$ so that $S(M) = \inf_n s_n$. So it is sufficient to find a
(computable) sequence $s'_n$ so that $S(M) = \sup_n s'_n$ to compute the speed (find an n
so that $s_n - s'_n < \epsilon$).
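The search for a suitable n can be sketched as a simple loop. This is a generic illustration, not the paper's algorithm: `upper` and `lower` stand for any two computable sequences that bound the target from above and below and converge to it, and the toy sequences in the example are invented for demonstration only.

```python
from fractions import Fraction

def approximate(upper, lower, eps):
    """Return (n, value) with |value - S| < eps.

    Assumes lower(n) <= S <= upper(n) for all n >= 1 and that both
    sequences converge to S; otherwise the loop may not terminate.
    """
    n = 1
    while upper(n) - lower(n) >= eps:
        n += 1
    # S lies in [lower(n), upper(n)], an interval shorter than eps.
    return n, (upper(n) + lower(n)) / 2

# Toy sequences converging to 1/2 from above and from below (hypothetical).
toy_upper = lambda n: Fraction(1, 2) + Fraction(1, n)
toy_lower = lambda n: Fraction(1, 2) - Fraction(1, n)
```

Here `approximate(toy_upper, toy_lower, Fraction(1, 10))` stops at n = 21 and returns the value 1/2. The point of the construction is that neither sequence alone tells us when to stop; only the gap between the two does.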

To find such a sequence $s'_n$, it is sufficient to find configurations $c_n$ of near
maximal speed. To do that, we need to better understand configurations of maximal
speed.

First, we will establish (Propositions 6 and 7) that a configuration of maximal 
speed (entropy) cannot do too many zigzags, and must be only finitely many times 
at any given position. The idea is that revisiting cells that were already visited is 
a loss of time (and complexity), so the machine should avoid doing it. 

In the same vein, we can prove that the zigzags must not be too large (Proposition
8): the time of the first and last visit of a given cell must be roughly equivalent
($l_n(c) \sim f_n(c)$ in the notation of this proposition).

All this work allows us to redefine the problem as a graph problem: given a 
weighted (infinite) graph, find the path of minimum average weight (Proposition 
10). Using the graph approach, we will then prove (Theorem 11) that this average 
minimum weight can be well approximated by considering only finite graphs. Fi- 
nally, the speed and entropy for finite graphs are easy to compute (Theorems 12 
and 14), which ends the proof. 

In each section, the proofs will always be done first for the speed, then for the 
entropy. We deliberately choose to have similar proofs in both cases, to help to 
understand the proof for the entropy, which is more complex. In particular, some 
statements about the speed are probably a bit more elaborate than they need to 
be. 

3.1. Biinfinite tapes are no better. The first step in the proof is to simplify
the model: we will prove that to achieve the maximum speed (resp. maximum
complexity), we only need to consider configurations that never cross the origin,
i.e., that stay always on the same side of the tape. This is quite obvious, as changing
from a position $i > 0$ to a position $j < 0$ costs at least $i + (-j)$ steps, and might
greatly reduce the average speed of the TM on this configuration.

Proposition 6. Let c be a configuration for which $S(M) = \lim_n \frac{s_n(c)}{n}$ and suppose
$S(M) > 0$. Then, during the computation on input c, the head of M is only finitely
many times in any given position i.

Proof. We prove only the result for i = 0; the result for all i follows by considering
$M^t(c)$ for some suitable t. We suppose by contradiction that the head of M is
infinitely often in position 0.

Let k be an integer. As $S(M) > 0$, there must exist a time t for which the head
is in position $\pm k$. Let t be the first time when this happens. We may suppose
w.l.o.g. that at time t the head is in position +k. Now let $t'_k$ be the next time the
head is in position 0, and finally let $t_k$ be the time at which the head was at its
rightmost position in the first $t'_k$ steps.

First, by definition $s_{t_k}(c) = s_{t'_k}(c)$. Furthermore, $t'_k \ge t_k + s_{t_k}(c)/2$. Indeed by
definition of t, the leftmost position in the first $t_k$ steps is at most k - 1 cells to
the left of the origin, so we went further to the right than to the left in the first
$t_k$ steps, so that the rightmost position is at least $s_{t_k}(c)/2$. Remark also that
$t_k \ge k$ (by definition).

From this we obtain

$$\frac{s_{t'_k}(c)}{t'_k} \le \frac{s_{t_k}(c)}{t_k + s_{t_k}(c)/2}$$

By taking a limsup on both sides we obtain

$$S(M) \le \frac{S(M)}{1 + S(M)/2}$$

A contradiction. □

Proposition 7. Let c be a configuration for which $H(M) = \lim_n \frac{K(T(c)|_n)}{n}$ and
suppose $H(M) > 0$. Then for any position i, the head of M is only finitely many
times in position i.

Proof. It's exactly the same proof.

Note that $K(T(c)|_{t'_k}) \le K(T(c)|_{t_k}) + O(\log t'_k)$ (the first $t'_k$ letters of $T(c)$ can
be recovered if we know only the first $t_k$ letters, and the number of letters we want
to recover), and $t'_k \ge t_k + K(T(c)|_{t_k})/(2 \log |\Sigma|) + O(\log t_k)$ (indeed
$K(T(c)|_{t_k}) \le n \log |\Sigma| + O(\log t_k)$ where $n = s_{t_k}(c)$ is the number of cells read
during times $t \le t_k$, and $t'_k \ge t_k + n/2$), from which we get the same contradiction. □

These two propositions state that we only have to deal with configurations that
never reach the position i = 0 once they leave it at t = 0 (replace c by $M^p(c)$ for a
suitable p).

If we deal with the disjoint union of the Turing machine M and its mirror
(exchange left and right) $\tilde{M}$, we may now assume, and we do in the rest of this
section, that the maximum speed and complexity are reached with a configuration
that never goes to negative positions $i < 0$ and, if $S(M) > 0$ (resp. $H(M) > 0$), that
passes only finitely many times through any given position.

3.2. A reformulation. Recall that we suppose in the following sections that the
maximum speed is obtained for a configuration that never goes to negative positions.

Let us call $f_n(c)$ (f for first) the first time we reach position n. Then the average
speed on a configuration c (for which the Turing machine never goes in negative
positions) can be defined equivalently as $\lim_n \frac{n}{f_n(c)}$. We prove now a stronger
statement.

Let's call $l_n(c)$ the last time we reach position n. If we never reach position $\pm n$,
or if we reach it infinitely often, let $l_n(c) = \infty$.

Proposition 8.

$$S(M) = \max_c \limsup_n \frac{n}{f_n(c)} = \max_c \liminf_n \frac{n}{l_n(c)}$$

$$H(M) = \max_c \limsup_n \frac{K(c|_n)}{f_n(c)} = \max_c \liminf_n \frac{K(c|_n)}{l_n(c)}$$

If the speed (resp. entropy) is nonzero, the maximum is reached for some configuration
c for which $l_n(c)$ is never infinite. In particular, for this configuration,
$l_n(c) \sim f_n(c)$.

Proof. It is clear that $S(M)$ and $H(M)$ are upper bounds, as $n \le s_{f_n(c)}(c)$ and
$K(c|_n) \le K(T(c)|_{f_n(c)}) + O(\log n)$. In particular the result is true if $S(M) = 0$
(resp. $H(M) = 0$).

We first deal with the speed. Let c be a configuration of maximum speed. By
the previous subsection, we may suppose that c never reaches negative positions.

Let $t_n = l_n(c)$. Let p be the rightmost position the head reaches before $t_n$ and
$t'_n$ the first time this position is reached. Note that $s_{t_n}(c) = s_{t'_n}(c) = p$ (no negative
position is ever reached).

From this we get $\lim_n \frac{t_n}{t'_n} = \lim_n \frac{s_{t'_n}(c)/t'_n}{s_{t_n}(c)/t_n} = 1$.

Note also that $p \ge n$ and $t_n \ge t'_n + p - n$ (we are at position $s_{t'_n} = s_{t_n}$ at time
$t'_n$ and at position n at time $t_n$).
Hence

$$\frac{n}{t_n} \ge \frac{s_{t_n}(c)}{t_n} - \frac{t_n - t'_n}{t_n}$$

From which the result follows.

For the entropy, the proof is almost the same. From
$K(T(c)|_{t_n}) = K(T(c)|_{t'_n}) + O(\log t_n)$, we get again that $\lim_n \frac{t_n}{t'_n} = 1$.

Now $K(T(c)|_{t_n}) \le K(c|_n) + (t_n - t'_n) \log |\Sigma| + O(\log t_n)$ (the first $t_n$ letters of $T(c)$
can be recovered if we know $t_n$ and the first p letters of c, hence if we know the first n
letters of c and the $p - n \le t_n - t'_n$ next letters), from which the result follows again. □

3.3. Crossing sequences. Now the last technical tool we need are crossing
sequences, introduced by Hennie [Hen65].

First denote by $C^+$ the set of configurations c on which:

• The Turing machine never reaches any position $i < 0$.

• The Turing machine never reaches the position 0 again once it leaves it at
t = 0.

• For any $i > 0$, the head of the Turing machine is only finitely many times in
position i.

The last sections prove that we only have to deal with configurations in $C^+$.

Let c be such a configuration. The crossing sequence at boundary i is the sequence
of states of the machine when its head crosses the boundary between the i-th
cell and the (i + 1)-th cell. We denote by $C_i(c)$ the crossing sequence at boundary i.
Note that $C_0(c)$ consists of a single state, which is the initial state of c (the machine
never reaches the position 0 anymore) and $C_i(c)$ is finite for $i > 0$.

The main idea of the crossing sequences is that $C_i(c)$ represents all the exchange
of information between the positions $j \le i$ and the positions $j > i$ of the tape.
In particular, if $C_i(c) = C_j(c')$ for two configurations c, c', and if we consider the
configuration $\tilde{c}$ that is equal to c up to i then equal to c' (shifted by i - j so that the
(j + 1)-th cell of c' becomes the (i + 1)-th cell of $\tilde{c}$), then the Turing machine on $\tilde{c}$ will
behave exactly as on c on all positions less than i, and as on c' (shifted) on positions
more than i. Hence the crossing sequences capture exactly the behaviour of the
Turing machine.

We now consider the following labeled graph (automaton) G: The vertices of G
are all finite words over the alphabet Q (all possible crossing sequences), and there
is an edge from w to w' labeled by $a \in \Sigma$ if w and w' are compatible, in the sense
that it looks possible to find a configuration c and a position i so that $C_i(c) = w$,
$C_{i+1}(c) = w'$ and a is the letter at position i + 1 in c (said otherwise, w and w' are
two consecutive crossing sequences for some configuration c). The exact definition
is as follows. We define recursively two subsets L and R of $Q^* \times Q^* \times \Sigma$ as follows:
o $(\epsilon, \epsilon, a) \in L$, $(\epsilon, \epsilon, a) \in R$
o If $\delta(q_1, a) = (q_2, b, -1)$ then $(q_1 q_2 w, w', a) \in L$ iff $(w, w', b) \in L$
o If $\delta(q_1, a) = (q_2, b, +1)$ then $(q_1 w, q_2 w', a) \in L$ iff $(w, w', b) \in R$
o If $\delta(q_1, a) = (q_2, b, -1)$ then $(q_2 w, q_1 w', a) \in R$ iff $(w, w', b) \in L$
o If $\delta(q_1, a) = (q_2, b, +1)$ then $(w, q_1 q_2 w', a) \in R$ iff $(w, w', b) \in R$

Then there is an edge from w to w' labeled a if and only if $(w, w', a) \in L$.
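The recursive definition of L and R translates directly into a membership test. The following Python sketch is one possible reading of the rules above: it represents crossing sequences as tuples of states, assumes the machine only uses moves -1 and +1 (the rules as stated do not cover a move of 0), and the function name `compatible` is illustrative.

```python
def compatible(delta, w, w2, a, side="L"):
    """Decide whether (w, w2, a) belongs to L (side="L") or R (side="R").

    w and w2 are the crossing sequences on the left and right boundary of a
    cell initially holding the symbol a. delta maps (state, symbol) to
    (state, symbol, move) with move in {-1, +1}.
    """
    w, w2 = tuple(w), tuple(w2)
    while w or w2:
        # The head enters the cell from the left (side L) or from the right.
        incoming = w if side == "L" else w2
        if not incoming:
            return False
        q2, b, move = delta[(incoming[0], a)]
        if move not in (-1, +1):
            raise ValueError("only moves -1 and +1 are handled")
        if side == "L" and move == -1:      # in from the left, out to the left
            if len(w) < 2 or w[1] != q2:
                return False
            w = w[2:]
        elif side == "L":                   # in from the left, out to the right
            if not w2 or w2[0] != q2:
                return False
            w, w2, side = w[1:], w2[1:], "R"
        elif move == -1:                    # in from the right, out to the left
            if not w or w[0] != q2:
                return False
            w, w2, side = w[1:], w2[1:], "L"
        else:                               # in from the right, out to the right
            if len(w2) < 2 or w2[1] != q2:
                return False
            w2 = w2[2:]
        a = b                               # the cell now holds the written symbol
    return True
```

Since the machine is deterministic, at most one rule applies at each step, so the test runs in time linear in $|w| + |w'|$. Enumerating all pairs of words of length at most k with this test gives the edges of the finite subgraphs used later in this section.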

Note that this echoes a similar definition for two-way finite automata given
in [HU79, 2.6] where $(w, w', a) \in L$ is called "w left-matches w'" (the note in
Example 2.15 is particularly relevant). The exact definition above is also hinted at
in [Pig09].

Let us explain briefly these conditions. Suppose $\delta(q_1, a) = (q_2, b, +1)$, and suppose
that the Turing machine at some point arrives in the cell i + 1 from the left, in
the state $q_1$, and sees a. Then by definition, the first symbol of $C_i(c)$ must be
$q_1$. By definition of the local rule $\delta$, the Turing machine will enter state $q_2$ and
go right so that the first symbol in $C_{i+1}(c)$ will be $q_2$. Now, the next time the
Turing machine comes into this cell, it must be coming from the right, and
when it does it will see the symbol b. This explains the rule $(q_1 w, q_2 w', a) \in L$ iff
$(w, w', b) \in R$, where w and w' represent the crossing sequences after the second
time the Turing machine comes to this cell.

Now it is clear that a configuration c defines a path in this graph G, and that 
we can recover the speed of the configuration from the graph, as explained in the 
following. 

A path in the graph G is a sequence $p = \{(w_i, u_i)\}_{i < N}$ where $w_i$ is a vertex of G
and $u_i$ a letter from $\Sigma$ so that $(w_i, w_{i+1}, u_i) \in L$ for all $i < N - 1$. A valid path is
an infinite path ($N = \infty$) so that $w_0$ consists of one single letter (state). We denote
by $\mathcal{P}(G)$ the set of valid paths of a graph G.

The following facts are obvious: 

Fact 9. For any $c \in C^+$, $\{(C_i(c), c_i)\}_{i \ge 0}$ is a valid path in G.

Furthermore, for any valid path $p = \{(w_i, u_i)\}_{i \ge 0}$, there exists a configuration
$c \in C^+$ so that $u_i = c_i$ and $C_i(c)$ is a prefix of $w_i$.

Note that it is indeed possible for $w_i$ to be strictly larger than $C_i(c)$.
Now we explain how we can redefine the speed on the graph G. 
If p is a finite path (N is finite), the length of p is $|p| = N$, the weight of p is
$weight(p) = \sum_{i < N} |w_i|$ and the complexity of p is $K(p) = K(u_0 \ldots u_{N-1})$.

If $p = (u_i, w_i)_{i \ge 0}$ is an infinite path, and $p|_n = (u_i, w_i)_{i < n}$, the average speed of p
is $\underline{s}(p) = \liminf_n \frac{|p|_n|}{weight(p|_n)}$ and the average complexity of p is
$\underline{K}(p) = \liminf_n \frac{K(p|_n)}{weight(p|_n)}$.
We define similarly $\overline{s}(p)$ and $\overline{K}(p)$.

Now note that $\sum_{i < n} |w_i|$ is bounded from below by the first time we go to
the position n, and from above by the last time we go to position n. So by the
previous section

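Proposition 10.

$$S(M) = \max_{p \in \mathcal{P}(G)} \overline{s}(p) = \max_{p \in \mathcal{P}(G)} \underline{s}(p)$$

$$H(M) = \max_{p \in \mathcal{P}(G)} \overline{K}(p) = \max_{p \in \mathcal{P}(G)} \underline{K}(p)$$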

Now to obtain the main theorems, let $G_k$ be the subgraph of G obtained by
taking only the vertices of size $|w_i| \le k$.

Theorem 11.

$$S(M) = \sup_k \sup_{p \in \mathcal{P}(G_k)} \underline{s}(p)$$

$$H(M) = \sup_k \sup_{p \in \mathcal{P}(G_k)} \underline{K}(p)$$

This means we only have to consider finite graphs to compute the speed (resp. 
entropy). We will prove in the next section that the speed and the entropy are 
computable for finite graphs, which will give the result. 

Before going to the proof, some intuition. Let p be a path of maximum speed
$S(M) > 0$. For the speed to be nonzero, vertices of large weight cannot be too
frequent in p. Now the idea is to bypass these vertices (by using other paths in the
graph G) to obtain a new path p' with almost the same average speed. For the
speed, it's actually possible to obtain a path p' of the same speed (this will be done
in the next section). However, for the entropy, it is likely that these paths were
actually of great complexity so that their removal gives us a path of smaller (yet
very near) average complexity.

Proof. First, the speed. One direction is obvious by definition. We suppose that
$S(M) > 0$, otherwise the result is trivial. Let p be a path of maximum speed.

Let k be any integer so that $1/k < S(M)$. For any vertices w and w' of size less than or
equal to k so that p goes through w and w' in that order, choose some finite path
P(w, w') from w to w'. Now let K be an upper bound on the weights of all those
paths.

The idea is now simple: we will change p so that it will not go through any
vertices w of size $|w| > K$.

We do it like this: whenever there is a vertex of size greater than K, we
look at the last vertex w before it of size less than or equal to k, and at the first vertex
w' after it of size less than or equal to k, and we replace the portion of the path
between them by P(w, w'). Let's call p' this new path. Note that there must exist such
a vertex w', otherwise all vertices will be of size greater than k after some time, which
means the speed on p is less than $1/k$, a contradiction.

Now we prove this construction works.

Let n be an integer so that for all $m \ge n$,

$$\frac{|p|_m|}{weight(p|_m)} > S(M)/2$$

Now let $m \ge n$ be such that the vertex $w_m$ of p is of size less than k. We will look at
how the m first positions of p were changed into p'. Let m' be the position of the
vertex $w_m$ in p' ($w_m$ still appears in p' as we only change vertices of size greater
than k).

By the above inequality, it is clear that in the m first positions of the path p,
there are at most $2m/(kS(M))$ vertices of size greater than k. This means that this
portion of the path in p' is of length at least $m' \ge m - 2m/(kS(M))$. Furthermore,
at each time, we replace a finite path by a path of smaller weight (each path was
of weight at least K, and each new one is of weight at most K).

As a consequence, for this new path p' we have

$$\frac{m'}{weight(p'|_{m'})} \ge \frac{m - 2m/(kS(M))}{weight(p|_m)}$$

Hence

$$\underline{s}(p') \ge S(M) - 2/k$$

We have proven that some path in $G_K$ is within $2/k$ of the optimal speed, which
proves the result.






The proof for the entropy is, as always, very similar. We start from
$1/k < H(M)/\log |\Sigma|$, which guarantees that infinitely many vertices are of weight less
than k. As before, we will choose K greater than all weights, but now also greater
than $k^2$.

First, $K(p|_n) \le n \log |\Sigma| + O(\log n)$, so $\underline{K}(p) \le \underline{s}(p) \log |\Sigma|$, so we may choose n
so that for all $m \ge n$

$$\frac{|p|_m|}{weight(p|_m)} \ge H(M)/(2 \log |\Sigma|)$$

To simplify notations, let $L = H(M)/(2 \log |\Sigma|)$.

We now have to evaluate $K(p'|_{m'})$. To recover $p|_m$ from $p'|_{m'}$, we only need:
o To know m 

o To know the positions of the paths that were cut. 

o To know the labels of what was cut 

o To know the labels of what was added instead 

First, the positions. There are at most mL/K vertices of size at least K, so we did
at most mL/K cuts. The cuts can be described by two sets: the set of beginnings
of the cuts, and the set of endings of the cuts. Each set is of size at most mL/K.
For a given size $p \le mL/K$, there are at most $\binom{m}{mL/K}$ sets of size p. Thus each set
can be described by a binary word of size at most the logarithm of this quantity
(up to an additive term $O(\log m)$).

Second, what was cut. We only cut vertices of size at least k, so this can be
described by a single word of size at most $\lceil mL/k \rceil \lceil \log |\Sigma| \rceil$.

Third, what was added. At each cut, we added one of the paths P(w, w'). Now
there are at most $|Q|^{k+1}$ words of size at most k, hence there are at most $|Q|^{2(k+1)}$
such paths, and there are at most mL/K cuts, so this can be described by a word
of size at most $\lceil mL/K \rceil \, 2(k + 1) \lceil \log |Q| \rceil$.

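Thus

$$K(p'|_{m'}) \ge K(p|_m) - 2 \log \binom{m}{mL/K} - \lceil mL/k \rceil \lceil \log |\Sigma| \rceil - \lceil mL/K \rceil \, 2(k+1) \lceil \log |Q| \rceil - O(\log m)$$

Now $weight(p'|_{m'}) \le weight(p|_m)$ and $weight(p'|_{m'}) \ge m' \ge m(1 - L/k)$, so

$$\frac{K(p'|_{m'})}{weight(p'|_{m'})} \ge \frac{K(p|_m)}{weight(p|_m)} - \frac{\text{the same quantity}}{m(1 - L/k)}$$

Hence

$$\underline{K}(p') \ge H(M) - \frac{2E(L/K) + (L/k) \lceil \log |\Sigma| \rceil + (L/K) \, 2(k+1) \lceil \log |Q| \rceil}{1 - L/k}$$

where $E(p) = -p \log p - (1 - p) \log(1 - p)$. Now the quantity to the right tends to
$H(M)$ when k tends to infinity (recall that we choose K greater than $k^2$, which we
need for the last term), which proves the result. □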

3.4. The main theorems. Now we can explain how to use the last result to prove 
the main theorems. 

The idea is that we can compute the speed S{M) from above, by the formula 

5(M) =infsup^^ 

n (, n 

So it is sufficient to explain how to compute it from below and this comes precisely 
from the previous theorem. 
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Theorem 12. There exists an algorithm that, given a Turing machine M and a 
precision e, computes S{M) to a precision e. 

Proof. We only have to explain how to compute the maximum speed for a finite 
graph $G$. First, we may trim $G$ so that all vertices are reachable from a vertex of 
size 1. It is then obvious that the maximum speed is obtained by a path that reaches, 
then follows forever, a cycle of minimum average weight, so the maximum speed is exactly 
the inverse of the minimum average weight. This is easily computable, see [Kar78] 
for an efficient algorithm. □ 
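
The computation in this proof can be sketched as follows. This is a minimal illustration under assumptions of ours, not the construction of the paper: the graph is given as a list `sizes` of vertex sizes and a list `arcs` of pairs of vertex indices, it is assumed to be already trimmed, and the weight of a path is taken to be the sum of the sizes of its vertices. The function names are hypothetical. The minimum average weight is obtained with the standard dynamic programming of Karp's algorithm, started from every vertex at once.

```python
from fractions import Fraction

def min_mean_cycle(n, edges):
    """Minimum mean weight of a cycle in a directed graph (Karp's algorithm).

    n     -- number of vertices, named 0..n-1
    edges -- list of (u, v, w): an edge u -> v of integer weight w
    Returns a Fraction, or None if the graph has no cycle.
    """
    INF = float("inf")
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v,
    # starting anywhere (same as adding a source with 0-weight edges to all).
    d = [[INF] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w
    best = None
    for v in range(n):
        if d[n][v] == INF:
            continue  # no walk of n edges ends at v
        worst = max(Fraction(d[n][v] - d[k][v], n - k)
                    for k in range(n) if d[k][v] != INF)
        if best is None or worst < best:
            best = worst
    return best

def max_speed(sizes, arcs):
    """Inverse of the minimum average vertex size along a cycle."""
    # Charge each arc with the size of its target: the mean over a cycle
    # is then the mean of the vertex sizes on that cycle.
    edges = [(u, v, sizes[v]) for u, v in arcs]
    mean = min_mean_cycle(len(sizes), edges)
    return None if mean is None else 1 / mean

# Cycle 0 -> 1 -> 0 has average size 3/2; the loop on vertex 2 has average 3.
print(max_speed([1, 2, 3], [(0, 1), (1, 0), (1, 2), (2, 2)]))
```

Exact rational arithmetic is used so that the result is a fraction rather than a floating-point approximation.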

We can say a bit more: 

Theorem 13. The maximum speed $S(M)$ of a Turing machine is a rational number. 
It is reached by a configuration which is ultimately periodic. 

Proof. We suppose that $S(M) > 0$, otherwise the result is clear. 
We will prove that the sequence $\sup_{p \in G_k} s(p)$ is stationary. 
Let $k = 1 + \lceil 1/S(M) \rceil$. Let $K = k(k+1)|Q|^{k+1}$. 

Now we look at $\sup_{p \in G_{K'}} s(p)$ for some $K' > K$. The maximum is reached for 
some path that reaches some cycle of minimum average weight. 

Note that this cycle cannot be of length greater than $(k+1)|Q|^{k+1}$. Indeed, 
denote by $m$ the length of this cycle and suppose that $m > (k+1)|Q|^{k+1}$. As there 
are at most $|Q|^{k+1}$ vertices in this cycle of length at most $k$, the average speed 
on this cycle is less than 

\[ \frac{m}{(k+1)(m - |Q|^{k+1})} < \frac{1}{k} < S(M). \]

Now, there cannot be any vertex in this cycle of length at least $K = k(k+1)|Q|^{k+1}$, 
otherwise the average speed would be less than 

\[ \frac{(k+1)|Q|^{k+1}}{k(k+1)|Q|^{k+1}} = \frac{1}{k} < S(M). \]

Hence this cycle is already in $G_K$. 

Now if we look at the cycle of minimal average weight in $G_K$ that can be reached 
in $G$, hence in $G_P$ for some $P$, then it is clear that $S(M)$ is exactly the inverse 
of the average weight of this cycle, and it is reached for some path $p$ in $G_P$ that 
reaches then follows this cycle. □ 
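
To make the constants of this proof concrete, here is a small worked computation. It is a sketch under the assumption that the constants read $k = 1 + \lceil 1/S(M) \rceil$ and $K = k(k+1)|Q|^{k+1}$; the function names and the sample values (a speed of 2/5 and 2 states) are ours and purely illustrative.

```python
from fractions import Fraction
import math

def constants(speed, num_states):
    """Return (k, K) with k = 1 + ceil(1/speed) and K = k(k+1)|Q|^(k+1)."""
    k = 1 + math.ceil(1 / speed)
    K = k * (k + 1) * num_states ** (k + 1)
    return k, K

def long_cycle_speed_bound(m, k, num_states):
    """Upper bound m / ((k+1)(m - |Q|^(k+1))) on the speed of a cycle of length m."""
    short = num_states ** (k + 1)  # bound on the number of vertices of size <= k
    return Fraction(m, (k + 1) * (m - short))

speed = Fraction(2, 5)          # hypothetical value of S(M)
k, K = constants(speed, 2)      # hypothetical machine with 2 states
m = (k + 1) * 2 ** (k + 1) + 1  # shortest cycle length excluded by the proof
print(k, K, long_cycle_speed_bound(m, k, 2) < Fraction(1, k) < speed)
```

With these values, any cycle longer than $(k+1)|Q|^{k+1}$ has speed below $1/k$, hence below the assumed $S(M)$, as the argument requires.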

Note that, while the maximum speed is a rational number, there is no algorithm 
that actually computes this rational number (we are only able to approximate it 
up to any given precision). This can be proven by an adaptation of the proof of 
the undecidability of the existence of a periodic configuration in a Turing machine 
[KO08]. 

Now we do the same for the entropy: 

Theorem 14. There exists an algorithm that, given a Turing machine $M$ and a 
precision $\varepsilon$, computes $H(M)$ to a precision $\varepsilon$. 

Proof. We only have to explain how to compute the maximum complexity for a 
finite graph $G$. However, we do not know how to do this in full generality. 
We will only show how to do it for the graphs $G_k$, which have an additional property: 
it is easy to see that they are weakly deterministic, in the sense that, given two 
vertices $w, w'$ and a word $u$, there is at most one path from $w$ to $w'$ labeled by $u$. 

First we trim Gk so that any vertex of Gk is reachable from a vertex of size 1. 

For a given $k$, we consider a set $B_k$ of infinite words over the alphabet $(Q \times \Sigma) \cup Q$ 
defined as follows: a word is in $B_k$ if and only if it never contains more than $k - 1$ 
consecutive letters in $Q$, never more than 1 consecutive letter in $Q \times \Sigma$, and all 
factors of the form $(a,q)w(b,q')w'(c,q'')$ satisfy that there is an edge from $qw$ to 
$q'w'$ labeled by $b$. 

Now it is clear that if $p = \{(u_i, q_i w_i)\}_{i \ge 0}$ is an infinite path in $G_k$, then 
$(u_0,q_0)w_0(u_1,q_1)w_1 \dots$ is a word of $B_k$. Conversely, any word of $B_k$, up to the 
deletion of at most $k + 1$ letters at its beginning, represents a path in $G_k$. 

Moreover, $K((u_0,q_0)w_0 \dots (u_n,q_n)w_n) = K(u_0 \dots u_n) + O(1) = K(p|_n) + O(1)$. 
Indeed, we can recover all the states knowing only $w_0$ and $w_n$, as the graph is 
weakly deterministic. Furthermore, the length of $(u_0,q_0)w_0 \dots (u_n,q_n)w_n$ is exactly 
$weight(p|_n)$. 

This means that the maximum complexity on the graph $G_k$ can be computed 
as: 

\[ \sup_{w \in B_k} \limsup_{n} \frac{K(w_0 \dots w_n)}{n}. \]

And this we know how to compute. Indeed, $B_k$ is what is called a subshift of 
finite type (it is defined by a finite set of forbidden words), for which the above 
quantity is exactly the entropy (!) of $B_k$ [Bru83, Sim], which is easy to compute, 
see e.g. [LM95]. 

To better understand what we did in this theorem, the intuition is as follows: 
computing the entropy of the trace is difficult, but the trace can be approximated 
by taking into account only configurations for which we cross at most $k$ times the 
frontier between any two consecutive cells. For this approximation $T_k$ of the trace, 
we can reorder the letters inside the trace so that transitions corresponding to the 
same position are consecutive, and this does not change the entropy. However, it 
makes it easier to compute. □ 
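
For the last step of the proof, recall the standard fact (see [LM95]) that the entropy of a subshift of finite type presented by a graph is the logarithm of the spectral radius of the adjacency matrix of this graph. The sketch below approximates this quantity by power iteration. It rests on assumptions of ours: the subshift is already given by a 0/1 adjacency matrix `adj` (a hypothetical presentation, not the explicit one of $B_k$), this matrix is irreducible with at least one edge, and logarithms are in base 2.

```python
import math

def sft_entropy(adj, iterations=200):
    """Approximate log2 of the spectral radius of a nonnegative square matrix.

    adj is assumed irreducible. We iterate with A + I instead of A so that
    the iteration does not oscillate on periodic graphs; for a nonnegative
    matrix A, the spectral radius of A + I is that of A plus 1.
    """
    n = len(adj)
    v = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        # w = (A + I) v
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        growth = max(w)            # current estimate of the radius of A + I
        v = [x / growth for x in w]
    return math.log2(growth - 1.0)

# Golden mean shift: binary words without two consecutive 1s.
golden = [[1, 1],
          [1, 0]]
print(sft_entropy(golden), math.log2((1 + math.sqrt(5)) / 2))
```

For a matrix that is not irreducible, one would treat each strongly connected component separately and take the maximum; this is not done here.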

Open Problems 

The main open problem is of course to strengthen the last theorem, and actually 
characterise the exact numbers that can arise as entropies of Turing machines. 
It cannot be all nonnegative computable numbers, as an enumeration of Turing 
machines would give us an enumeration of these numbers, which is impossible by 
an easy diagonalisation argument. The natural conjecture is that the supremum in 
the theorem is actually reached, which would prove that the numbers that arise 
as entropies of Turing machines are exactly the numbers that arise as entropies of 
subshifts of finite type, which are well known. 

Finally, the situation for Turing machines with two tapes is not clear. Of course, 
we know that the speed (resp. the entropy) is not computable [DB04] (there is no 
algorithm that, given a Turing machine and a precision $\varepsilon$, computes the speed up to 
$\varepsilon$), but we know of no example where the speed (resp. the entropy) is not a rational 
number (resp. a computable real number). 